Cross-corpus speech emotion recognition can be a useful transfer learning technique for building a robust speech emotion recognition system by leveraging information from various speech datasets, both cross-language and cross-corpus. However, more research is needed to understand the effective operating scenarios of cross-corpus speech emotion recognition, especially with the use of powerful deep learning techniques. In this paper, we use five different corpora in three different languages to investigate cross-corpus and cross-language emotion recognition using Deep Belief Networks (DBNs). Experimental results demonstrate that DBNs, with their generalization power, offer better accuracy than a discriminative method based on Sparse Auto Encoder and SVM. Results also suggest that using a large number of languages for training and using a small fraction of target data in training can significantly boost accuracy compared to using the same language for training and testing.